Mining Top-K Sequential Rules

Authors

  • Philippe Fournier-Viger
  • Vincent S. Tseng
Abstract

Mining sequential rules requires specifying parameters that are often difficult to set (the minimal confidence and minimal support). Depending on the choice of these parameters, current algorithms can become very slow and generate an extremely large number of results, or generate too few results and omit valuable information. This is a serious problem because, in practice, users have limited resources for analyzing the results and are thus often interested in discovering only a certain number of results, and fine-tuning the parameters can be very time-consuming. In this paper, we address this problem by proposing TopSeqRules, an efficient algorithm for mining the top-k sequential rules from sequence databases, where k is the number of sequential rules to be found and is set by the user. Experimental results on real-life datasets show that the algorithm has excellent performance and scalability.
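
The abstract does not describe how TopSeqRules works internally, so the following is only a brute-force sketch of the top-k task itself, not the authors' algorithm. It assumes that each sequence is a plain list of items, that a rule has a single item on each side, that rules are ranked by support, and that a confidence threshold (here the hypothetical parameter min_conf) still applies. The function names rule_stats and top_k_rules are illustrative.

```python
import heapq
from itertools import permutations


def rule_stats(sequences, a, b):
    """Return (support, confidence) of the rule a -> b.

    Assumption: a sequence matches the rule when the first occurrence
    of `a` is followed, later in the same sequence, by `b`.
    """
    with_a = 0    # sequences containing the antecedent
    matching = 0  # sequences where a occurs before b
    for seq in sequences:
        if a not in seq:
            continue
        with_a += 1
        first_a = seq.index(a)
        if b in seq[first_a + 1:]:
            matching += 1
    support = matching / len(sequences)
    confidence = matching / with_a if with_a else 0.0
    return support, confidence


def top_k_rules(sequences, k, min_conf):
    """Brute force: score every ordered item pair, keep the k best.

    Returns tuples (support, confidence, antecedent, consequent),
    sorted by support in descending order.
    """
    items = sorted({item for seq in sequences for item in seq})
    candidates = []
    for a, b in permutations(items, 2):
        support, confidence = rule_stats(sequences, a, b)
        if support > 0 and confidence >= min_conf:
            candidates.append((support, confidence, a, b))
    # The user fixes k directly instead of tuning a support threshold.
    return heapq.nlargest(k, candidates)
```

Calling top_k_rules(sequences, k, min_conf) on a small list of sequences returns at most k rules, so the size of the result is set directly by the user rather than indirectly through a minimum support value. Enumerating all item pairs is only practical for tiny inputs.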

Similar resources

TKS: Efficient Mining of Top-K Sequential Patterns

Sequential pattern mining is a well-studied data mining task with wide applications. However, fine-tuning the minsup parameter of sequential pattern mining algorithms to generate enough patterns is difficult and time-consuming. To address this issue, the task of top-k sequential pattern mining has been defined, where k is the number of sequential patterns to be found, and is set by the user. In ...

An Algorithm for Mining Large Sequences in Databases

Frequent sequence mining is a fundamental and essential operation in the process of discovering sequential rules. Most sequence mining algorithms use the Apriori methodology or build larger sequences from smaller patterns, a bottom-up approach. In this paper, we present an algorithm that uses a top-down approach for mining long sequences. Our algorithm defines dominancy of the sequence...

Pushing Constraints to Generate Top-K Closed Sequential Graph Patterns

In this paper, the problem of finding sequential patterns from graph databases is investigated. Two serious issues dealt with in this paper are the efficiency and effectiveness of the mining algorithm. A huge volume of sequential patterns is generated, most of which are uninteresting. The users have to go through a large number of patterns to find interesting results. In order to improve th...

A Quick Method for Querying Top-k Rules from Class Association Rule Set

Finding class association rules (CARs) is one of the most important research topics in data mining and knowledge discovery, with numerous applications in many fields. However, existing techniques usually generate an extremely large number of results, which makes analysis difficult. In many applications, experts are interested in only the most relevant results. Therefore, we propose a method for...

MARBLES: Mining Association Rules Buried in Long Event Sequences

Sequential pattern discovery is a well-studied field in data mining. Episodes are sequential patterns that describe events that often occur in the vicinity of each other. Episodes can impose restrictions on the order of the events, which makes them a versatile technique for describing complex patterns in the sequence. Most of the research on episodes deals with special cases such as serial and ...


Publication date: 2011